Abstract.


The human alphacoronaviruses HCoV-NL63 and HCoV-229E are commonly associated with upper respiratory tract
infections (URTI). Information on their molecular epidemiology and evolutionary dynamics in the tropical region of
southeast Asia, however, is limited. Here, we analyzed the phylogeny, temporal distribution, population history,
and clinical manifestations among patients infected with HCoV-NL63 and HCoV-229E. Nasopharyngeal swabs
were collected from 2,060 consenting adults presenting with acute URTI symptoms in Kuala Lumpur, Malaysia,
between 2012 and 2013. The presence of HCoV-NL63 and HCoV-229E was detected using multiplex polymerase
chain reaction (PCR). The spike glycoprotein, nucleocapsid, and 1a genes were sequenced for phylogenetic
reconstruction and Bayesian coalescent inference. A total of 68/2,060 (3.3%) subjects were positive for human
alphacoronavirus; HCoV-NL63 and HCoV-229E were detected in 45 (2.2%) and 23 (1.1%) patients, respectively. A
peak in the number of HCoV-NL63 infections was recorded between June and October 2012. Phylogenetic
inference revealed that 62.8% of HCoV-NL63 infections belonged to genotype B and 37.2% to genotype C, while all
HCoV-229E sequences clustered within group 4. Molecular dating analysis indicated that the origin of
HCoV-NL63 was dated to 1921, before it diverged into genotype A (1975), genotype B (1996), and genotype C (2003). The
root of the HCoV-229E tree was dated to 1955, before it diverged into groups 1-4 between the 1970s and 1990s.
The study describes the seasonality, molecular diversity, and evolutionary dynamics of human alphacoronavirus
infections in a tropical region.


INTRODUCTION 


Human coronaviruses were first reported in the mid-1960s and are known to be associated 
with acute upper respiratory tract infections (URTI) or the common cold.'* According to the 
International Committee on Taxonomy of Viruses, human coronavirus NL63 (HCoV-NL63)
and 229E (HCoV-229E) belong to the alphacoronavirus genus, a member of the Coronaviridae
family. Coronaviruses are positive-strand RNA viruses with among the largest known RNA genomes, approximately
27-31 kb in size.* In previous studies, analysis of the spike (S) glycoprotein, nucleocapsid (N),
and 1a genes of HCoV-NL63 and HCoV-229E revealed evidence of genetic recombination,
genetic drift, and positive selection events as part of the evolution of the virus.”
Phylogenetically, HCoV-NL63 and HCoV-229E are more closely related to each other than to
any other human coronavirus.’


HCoV-NL63 and HCoV-229E account for about 5% of all acute URTI,’? and in some cases, 
a small proportion of infections are associated with hospital admission.'*'! URTI symptoms such 
as cough and sore throat are often observed in patients infected with either HCoV-NL63 or 
HCoV-229E.'”'? The prevalence of HCoV-NL63 varies from one study to another; however, in 
most temperate and tropical countries, it appears to peak between September and April, whereas
HCoV-229E is usually detected at low rates throughout the year.'*'° In spite of the clinical 
importance of HCoV infections,'’ the prevalence, seasonality, clinical features, and phylogenetic
characteristics of HCoVs remain mostly unreported from the tropical region of southeast Asia.
On the basis of the S, N, and 1a genes of HCoV-NL63 and HCoV-229E sequences from
Malaysia and worldwide, we describe the genetic history and phylodynamic profiles of both
human alphacoronaviruses using a set of phylogenetic tools. 


MATERIALS AND METHODS 


Ethics statement. 


The study was approved by the University of Malaya Medical Ethics Committee 
(MEC890.1). Standard, multilingual consent forms permitted by the Medical Ethics Committee 
were used. Written consent was obtained from all study participants. 


Clinical specimens. 


A total of 2,060 consenting outpatients who presented with acute URTI symptoms were 
recruited at the Primary Care Clinic of University Malaya Medical Center in Kuala Lumpur, 
Malaysia, between March 2012 and February 2013. Demographic data such as age, gender, and 
ethnicity were acquired before the collection of nasopharyngeal swabs. The severity of the URTI 
symptoms (sneezing, nasal discharge, nasal congestion, headache, sore throat, voice hoarseness, 
muscle ache, and cough) was graded according to criteria described earlier.'*7' The 
nasopharyngeal swabs were transferred to the laboratory in universal transport media and stored 
at -80°C.


Molecular detection of HCoV-NL63 and HCoV-229E. 


Extraction of total nucleic acids from the nasopharyngeal swabs was carried out using the 
magnetic bead-based protocols applied in the NucliSENS easyMAG automated nucleic acid
extraction system (BioMérieux).”*** The presence of respiratory viruses in specimens was 
examined using the xTAG Respiratory Virus Panel FAST multiplex reverse transcriptase 
polymerase chain reaction (RT-PCR) assay (Luminex Molecular Diagnostics), which can 
identify HCoV-NL63, HCoV-229E, HCoV-OC43, HCoV-HKU1, and other respiratory viruses 
and subtypes." 


Genetic analysis of HCoV-NL63 and HCoV-229E. 


Gene fragment sequencing of the S (S1 domain), complete N, and partial 1a (nsp3) genes was
performed for HCoV-NL63 and HCoV-229E specimens. The S1 is a highly variable receptor-binding
domain, whereas the N and nsp3 are conserved regions within the coronavirus genome,
and these three regions are therefore efficiently used for genotyping.”® Viral RNA was reverse
transcribed into complementary DNA (cDNA) using the SuperScript III kit (Invitrogen) with
random hexamers (Applied Biosystems). The partial S gene (S1 domain) (HCoV-NL63: 1,383 nt
[20,413-21,796] and HCoV-229E: 855 nt [20,819-21,674]), complete N gene (HCoV-NL63:
1,133 nt [26,133-27,266] and HCoV-229E: 1,330 nt [25,673-27,003]), and partial 1a (nsp3)
gene (HCoV-NL63: 781 nt [5,811-6,592] and HCoV-229E: 766 nt [5,898-6,664]) were
amplified through PCR using 10 µM of newly designed or previously published primers listed in
Table 1. The PCR mixture (25 µL) contained cDNA, PCR buffer (10 mM Tris-HCl [pH 8.3], 50
mM KCl, 3 mM MgCl2, and 0.01% gelatin), 100 µM (each) deoxynucleoside triphosphates, Hi-Spec
additive, and 4 U/µL BIO-X-ACT Short DNA polymerase (BioLine). The cycling
conditions were as follows: initial denaturation at 95°C for 5 minutes followed by 40 cycles of
94°C for 1 minute, 54.5°C for 1 minute, 72°C for 1 minute, and a final extension at 72°C for 10
minutes. PCR reactions were performed in a C1000 Touch automated thermal cycler (Bio-Rad).
Nested/semi-nested PCR was performed if necessary, under the same cycling conditions at 30
cycles. Purified PCR products were sequenced using the ABI PRISM 3730XL DNA Analyzer
(Applied Biosystems). The nucleotide sequences were codon aligned with relevant complete and
partial HCoV-NL63 and HCoV-229E reference sequences retrieved from GenBank.”
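
The cycling conditions described above can be written down as a small data structure, which makes the programmed hold time of each run easy to check. The sketch below is illustrative only: the step labels inside the cycle (denaturation, annealing, extension) are conventional names assumed here, the nested round is assumed to repeat the same initial and final holds, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str        # descriptive label (assumed, not from the text)
    temp_c: float    # hold temperature in degrees Celsius
    minutes: float   # hold time in minutes


def total_minutes(initial, cycle, final, n_cycles):
    """Sum the programmed hold times; ramping between steps is ignored."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


INITIAL = Step("initial denaturation", 95.0, 5)
CYCLE = (
    Step("denaturation", 94.0, 1),
    Step("annealing", 54.5, 1),
    Step("extension", 72.0, 1),
)
FINAL = Step("final extension", 72.0, 10)

# First-round PCR uses 40 cycles; nested/semi-nested PCR uses 30 cycles.
first_round = total_minutes(INITIAL, CYCLE, FINAL, n_cycles=40)
nested_round = total_minutes(INITIAL, CYCLE, FINAL, n_cycles=30)
print(first_round, nested_round)
```

Under these assumptions the first round holds for 135 minutes and the nested round for 105 minutes, excluding ramp times.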


Maximum clade credibility (MCC) trees for the partial S (S1 domain), complete N, and
partial 1a (nsp3) genes were reconstructed in BEAST (version 1.7).°* MCC trees were produced
using a relaxed molecular clock, assuming an uncorrelated lognormal distribution, under the general
time-reversible nucleotide substitution model with a proportion of invariant sites (GTR+I) and a
constant or exponential coalescent tree model. The Markov chain Monte Carlo run was set at 6 x
10° steps, sampled every 10,000 states. The trees were annotated using the TreeAnnotator
program included in the BEAST package, after a 10% burn-in, and visualized in FigTree
(version 1.3.1).°° The evolutionary history and divergence time (in calendar year) for the
HCoV-NL63 and HCoV-229E genotypes were also assessed. The mean divergence time and the 95%
highest posterior density regions were evaluated. The best-fitting model was determined by the
Bayes factor using marginal likelihood analysis implemented in Tracer (version 1.5).? The
substitution rate of 3.3 × 10⁻⁴ substitutions/site/year for the S gene of human alphacoronavirus
estimated previously was used for the divergence time inference.”
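
The model comparison step can be illustrated with a short calculation. A Bayes factor is the ratio of the marginal likelihoods of two models, so it can be obtained from two log marginal likelihoods. The sketch below assumes natural-log marginal likelihoods as input; the two numeric values are hypothetical placeholders, not results from this study, and the function names are illustrative rather than part of Tracer.

```python
import math


def bayes_factor(log_ml_a, log_ml_b):
    """Bayes factor of model A over model B from natural-log marginal likelihoods."""
    return math.exp(log_ml_a - log_ml_b)


def is_supported(bf, threshold=3.0):
    """True when model A is favored beyond the chosen threshold."""
    return bf >= threshold


# Hypothetical log marginal likelihoods for the two coalescent tree models.
LOG_ML_EXPONENTIAL = -3049.6
LOG_ML_CONSTANT = -3050.2

bf = bayes_factor(LOG_ML_EXPONENTIAL, LOG_ML_CONSTANT)
print(f"Bayes factor = {bf:.2f}, supported = {is_supported(bf)}")
```

With placeholder values this close, the Bayes factor falls below 3.0, the kind of outcome in which neither demographic model is clearly preferred.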


Maximum likelihood (ML) phylogenetic trees were also reconstructed for the three regions in
the PAUP 4.0 (phylogenetic analysis using parsimony) software,° * with a Hasegawa-Kishino-Yano
nucleotide substitution model plus discrete gamma categories. The statistical robustness
and reliability of the branching orders were evaluated by a bootstrap analysis of 1,000 replicates.
To investigate the genetic relatedness among the HCoV-NL63 and HCoV-229E genotypes,
inter-genotype pairwise nucleotide distances were estimated for the S gene using MEGA 5.1.°° Such
analysis was not performed for the N and 1a genes because of their high sequence conservation
across HCoV-NL63 and HCoV-229E genotypes.”®
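
As an illustration of the inter-genotype distance calculation, the sketch below computes a mean between-group p-distance (proportion of differing sites) and a bootstrap standard error by resampling alignment columns. It is a minimal sketch under stated assumptions: the text does not name the distance model used in MEGA, so the p-distance is assumed here; the toy sequences are invented for demonstration; and the function names are hypothetical, not MEGA's interface.

```python
import random
from itertools import product

SKIP = set("-N?")  # gaps and ambiguous bases are excluded pairwise


def p_distance(seq_a, seq_b):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in SKIP or b in SKIP:
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_between_group_distance(group_a, group_b):
    """Average p-distance over all pairs drawn from two groups."""
    values = [p_distance(a, b) for a, b in product(group_a, group_b)]
    return sum(values) / len(values)


def bootstrap_se(group_a, group_b, replicates=200, seed=1):
    """Standard error from resampling alignment columns with replacement."""
    rng = random.Random(seed)
    length = len(group_a[0])
    estimates = []
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        res_a = ["".join(s[i] for i in cols) for s in group_a]
        res_b = ["".join(s[i] for i in cols) for s in group_b]
        estimates.append(mean_between_group_distance(res_a, res_b))
    mean = sum(estimates) / replicates
    variance = sum((e - mean) ** 2 for e in estimates) / (replicates - 1)
    return variance ** 0.5


# Toy alignment (invented): two sequences per group, 10 sites each.
genotype_x = ["ACGTACGTAC", "ACGTACGTAT"]
genotype_y = ["ACGTTCGTAC", "ACGATCGTAC"]

distance = mean_between_group_distance(genotype_x, genotype_y)
se = bootstrap_se(genotype_x, genotype_y)
print(f"mean distance = {100 * distance:.1f}%, bootstrap SE = {100 * se:.1f}%")
```

Multiplying the result by 100 gives the percentage form in which such distances are reported in Table 3.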


Statistical analysis. 


All categorical variables were analyzed using the two-tailed Fisher's exact test or the χ² test in the
Statistical Package for the Social Sciences (SPSS, release 16.0; IBM Corp., Chicago, IL). P values <
0.05 were considered significant.
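
For readers unfamiliar with the test, the sketch below implements a two-tailed Fisher's exact test for a 2 x 2 table in plain Python and applies it to the gender counts from Table 2 (HCoV-NL63: 25 male, 20 female; HCoV-229E: 12 male, 11 female). It is an illustrative re-implementation, not the SPSS procedure used in the study; the two-sided rule assumed here sums the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Rows: HCoV-NL63, HCoV-229E; columns: male, female (counts from Table 2).
p_gender = fisher_exact_two_sided(25, 20, 12, 11)
print(f"P = {p_gender:.2f}")
```

A P value above 0.05 here indicates no significant association between gender and the infecting virus.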


Nucleotide sequences. 


HCoV-NL63 and HCoV-229E nucleotide sequences produced in the study have been 
deposited in GenBank under the accession nos. KT359730-KT359913. 


RESULTS AND DISCUSSION 


Detection of HCoV-NL63 and HCoV-229E in nasopharyngeal swabs. 


In the current cross-sectional study, a total of 2,060 nasopharyngeal swab specimens 
collected from Kuala Lumpur, Malaysia, throughout a 12-month study period (March 2012 to 
February 2013), were screened for the presence of HCoV-NL63 and HCoV-229E using the 
multiplex RT-PCR method, as an alternative approach to other detection methods such as cell 
culture.*° Human alphacoronavirus was identified in 68 (3.3%) subjects; HCoV-NL63 and 
HCoV-229E were detected in 45/2,060 (2.2%) and 23/2,060 (1.1%) patients, respectively. These 
findings are consistent with the global average prevalence of human alphacoronavirus, which 
ranges between 1% and 10%, with HCoV-229E generally detected at lower rates than
HCoV-NL63.*'°°737° In contrast to an earlier study,*' no coinfection of alpha- and betacoronavirus
(HCoV-OC43 and HCoV-HKU1) was observed within an individual. Age, gender, and ethnicity
of the patients are summarized in Table 2. A peak in the number of HCoV-NL63 infections
was recorded for the period between June and October 2012, although the number of patients 
with URTI symptoms screened during those months was relatively low (Figure 1). This pattern 
of virus prevalence corroborates that observed in neighboring Thailand, in which a
peak of HCoV-NL63 incidence was recorded in September.'* In contrast, studies from temperate 
regions commonly reported a higher prevalence of HCoV-NL63 during winter seasons.’ ”** 
The number of HCoV-229E infections detected in Malaysia, however, was low, with no
significant peak observed throughout the year, similar to other studies reported worldwide.14,38,43
It is important to note that the study was performed over a relatively short duration, which
limits the comparison of epidemiological and disease trends with reports from other countries.
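
The detection rates reported above follow directly from the counts, as the short sketch below shows. The 95% Wilson score interval is an addition made here for illustration; the study itself reports only the point prevalences.

```python
from math import sqrt


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("need 0 <= positives <= total and total > 0")
    p = positives / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


TOTAL_SWABS = 2060
COUNTS = {"alphacoronavirus": 68, "HCoV-NL63": 45, "HCoV-229E": 23}

for name, positives in COUNTS.items():
    low, high = wilson_interval(positives, TOTAL_SWABS)
    print(
        f"{name}: {100 * positives / TOTAL_SWABS:.1f}% "
        f"(95% CI {100 * low:.1f}-{100 * high:.1f}%)"
    )
```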


Phylogenetic analysis of the S, N, and 1a genes.


A total of 42/45 (93.3%) partial S (S1 domain) and 43/45 (95.6%) of each complete N and 
partial 1a (nsp3) genes were successfully sequenced from HCoV-NL63 specimens.
Amplification of these genes was difficult for two xTAG-positive HCoV-NL63 specimens,
possibly due to their low viral copy number. Phylogenetic analysis of HCoV-NL63 (Figure 2 and 
Supplemental Figure 1) showed that 27 subjects (27/43, 62.8%) in the study belonged to 
genotype B (supported by a posterior probability of 1.0 and bootstrap value of 100% at the 
internal nodes of the MCC and ML trees of the S gene, respectively, with an intra-group pairwise 
genetic distance of 0.6% + 0.1%) together with previously reported sequences from the United 
States, Europe, and Asia.””*”*”? Another 16 subjects (16/43, 37.2%) were found to be grouped 
under genotype C (supported by a posterior probability of 1.0 and bootstrap value of 67% at the 
internal nodes of the MCC and ML trees of the S gene, respectively, with an intra-group pairwise 
genetic distance of 0.2% + 0.1%) with recently described global sequences.”>**” Discordance in 
phylogenetic clustering among the S, N, and 1a genes of the HCoV-NL63 Malaysian sequences
was observed (Supplemental Figure 1). On the basis of the S (S1 domain) gene analysis, 26
Malaysian strains (26/42; 61.9%) belonged to genotype B while another 16 Malaysian strains
(16/42; 38.1%) were classified within genotype C. In contrast, sequences of the three
HCoV-NL63 genotypes (A, B, and C) appear to be intermingled in the N and 1a phylogenetic trees.


Such discordance was similarly reported in earlier studies, which confirmed that this
phylogenetic pattern resulted from multiple recombination events along the HCoV-NL63
genome. In addition, the S1 region sequenced in this study is considered the most
variable along the genome, while the N and 1a (nsp3) genes are highly conserved.” To estimate the
genetic diversity between HCoV-NL63 genotypes A, B, and C, inter-genotype pairwise genetic 
distance was assessed for the S gene (Table 3). Genetic distances between genotypes A versus B 
and B versus C were high (more than 5.0%), compared with that between genotypes A versus C, 
which was at 2.1%. This is consistent with the phylogenetic tree topology in which genotypes A 
and C were more closely related and probably shared a common ancestor. 


At least one gene (S, N, and/or 1a) was successfully sequenced from 23 positively tested
HCoV-229E specimens (16, 18, and 22 of HCoV-229E S, N, and 1a genes, respectively).
Phylogenetic analysis revealed that all of the HCoV-229E sequences obtained in this study were
classified within group 4, which includes isolates that have been globally circulating since 2001
(Figure 3 and Supplemental Figure 2).°393! The group was supported by a posterior probability 
of 1.0 and bootstrap value of 100% at the internal nodes of the MCC and ML trees of the S gene, 
respectively, with an intra-group pairwise genetic distance of 0.3% + 0.1%. Such phylogenetic 
data were comparable to those obtained from the N tree, resulting from the substitution hot spots
in the S1 and N regions of the HCoV-229E genome.” The four HCoV-229E groups could not be
clearly defined within the 1a gene tree because of the limited number of reference sequences
available in the public database (Supplemental Figure 2). Inter-genotype pairwise genetic
distance was generally low (below 5.0%) in the S gene among groups 1-4 (Table 3).


Estimation of divergence times. 


The molecular clock analysis of HCoV-NL63 and HCoV-229E was performed using the 
coalescent-based Bayesian relaxed molecular clock under the constant and exponential tree 
models (Figures 2 and 3). The mean evolutionary rates for the S gene of HCoV-NL63 and
HCoV-229E were newly estimated based on the constant tree model at 4.3 × 10⁻⁴ (2.3-6.7 ×
10⁻⁴) and 3.9 × 10⁻⁴ (1.3-6.4 × 10⁻⁴) substitutions/site/year, respectively. These results were
similar to the previously reported substitution rate of the alphacoronavirus S gene (3.3 × 10⁻⁴
substitutions/site/year).” The evolutionary analysis indicated that the time of the most recent
common ancestor (tMRCA) of HCoV-NL63 dated back to the 1920s, while the estimated
divergence time of genotype A was dated to 1975, followed by genotype B around 1996 and
genotype C in 2003 (Figure 2). Furthermore, the divergence time of HCoV-229E (Figure 3) was
estimated at around 1955, while the tMRCA of group 1 was dated to 1976, followed by that of group
2 in 1981, group 3 in 1989, and group 4 in 1996. The appearance of groups 1-4 in
chronological order lends support to the earlier reported hypothesis that positive selection and
genetic drift play a major role in the evolution of HCoV-229E.°”” To the best of our knowledge,
this is the first study to report the divergence times of human alphacoronavirus genotypes. In
addition, the most recently reported HCoV-229E strains (between 2001 and 2013) from major
parts of the world belong to group 4. In accordance with earlier studies, genotype replacement is
evident within HCoV-229E, although sampling bias may also influence the results.°*” Bayes
factor analysis showed no significant difference (Bayes factor less than 3.0) between the constant
and exponential coalescent models of demographic analysis; the divergence times
estimated using the constant coalescent tree model were similar to those calculated using the
exponential model (Supplemental Table 1).
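
As a rough plausibility check on the dated trees, a strict-clock approximation relates a pairwise distance d to the time t since two lineages split: each lineage accumulates changes at rate r, so d ≈ 2rt and t ≈ d / (2r). The sketch below is a back-of-envelope illustration only, not the Bayesian relaxed-clock analysis used in the study. It assumes a constant rate, ignores multiple substitutions at a site and differences in sampling dates, and reads the estimated HCoV-NL63 S gene rate as 4.3 × 10⁻⁴ substitutions/site/year.

```python
def years_since_split(pairwise_distance, rate_per_site_per_year):
    """Strict-clock estimate: both lineages diverge at the same rate, so d = 2 * r * t."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2.0 * rate_per_site_per_year)


DIST_A_VS_B = 0.076   # genotype A versus B in the S gene, 7.6% (Table 3)
RATE_NL63_S = 4.3e-4  # assumed reading of the estimated S gene rate

years = years_since_split(DIST_A_VS_B, RATE_NL63_S)
print(f"approximately {years:.0f} years since genotypes A and B shared an ancestor")
```

The result, roughly 88 years, is of the same order as an HCoV-NL63 root placed in the early twentieth century, although the two approaches are not directly comparable.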


Clinical symptoms assessment. 


Clinical findings of the URTI symptoms (sneezing, nasal discharge, nasal congestion, 
headache, sore throat, hoarseness of voice, muscle ache, and cough) and their severity levels 
(none, moderate, and severe) were analyzed using the two-tailed Fisher’s exact test. The 
association between symptom severity and HCoV-NL63/HCoV-229E infection was not significant
(P values > 0.05) (Supplemental Table 2). In line with previous clinical studies,'°“* the 
majority of patients infected with HCoV-NL63 and HCoV-229E presented with at least one 
respiratory symptom that was moderately severe. 


In summary, this study provides insight into the phylogeny and evolution of the HCoV-NL63 
and HCoV-229E genotypes. Genetic characterization of human alphacoronavirus isolates
currently circulating in Malaysia indicates the circulation of globally prevalent genotypes in the
tropical region of southeast Asia. This study has detailed the genetic history of HCoV-NL63 and
HCoV-229E genotypes. Since alphacoronaviruses evolve through recombination, positive
selection, and genetic drift events, continuous molecular surveillance of human alphacoronavirus
is warranted to keep track of the evolution of the virus in southeast Asia.


FIGURE 1. Annual distribution of HCoV-NL63 and HCoV-229E among adults with acute upper respiratory tract
infections in Kuala Lumpur, Malaysia. The total number of nasopharyngeal swabs screened and the monthly
distribution of HCoV-NL63 and HCoV-229E between March 2012 and February 2013 are presented.


FIGURE 2. Maximum clade credibility tree of HCoV-NL63. Spike gene (S1 domain) sequences (1,383 nt) were 
analyzed under the relaxed molecular clock with a GTR+I substitution model and a constant size coalescent model 
implemented in BEAST. Posterior probability values and the estimation of the time of the most recent common 
ancestors with 95% highest posterior density were indicated on major nodes. The HCoV-NL63 sequences obtained 
in this study were color coded and HCoV-NL63 genotypes A-C were indicated; green = genotype A, blue = 
genotype B, and red = genotype C. The recombinant genotype is indicated by purple color. The sampling site for 
each sequence was indicated by codes for the representation of countries. Country codes are as follows; MY = 
Malaysia; US = United States; JP = Japan; NL = Netherlands; CN = China. This figure appears in color at 
www.ajtmh.org. 


FIGURE 3. Maximum clade credibility tree of HCoV-229E. Spike gene (S1 domain) sequences (855 nt) were 
analyzed under the relaxed molecular clock with a GTR+I substitution model and a constant size coalescent model 
implemented in BEAST. Posterior probability values and the estimation of the time to the most recent common 
ancestors with 95% highest posterior density were indicated on the major nodes. The HCoV-229E sequences 
obtained in this study were color coded and the HCoV-229E groups 1-4 were indicated: green = genotype 1, red =
genotype 2, blue = genotype 3, and purple = genotype 4. The sampling site for each sequence was indicated by 
codes for the representation of countries. Country codes are as follows; MY = Malaysia; US = United States; JP = 
Japan; NL = Netherlands; CN = China; AU = Australia; IT = Italy. This figure appears in color at www.ajtmh.org. 


TABLE 1 


Polymerase chain reaction primers for HCoV-NL63 and HCoV-229E 


Target gene        HCoV            Primer       Location*        Sequence (5'-3')                                    Reference
Spike (S)          NL63            SP1F         20,390-20,412    Forward: TGAGTTTGATTAAGAGTGGTAGG
                                   SP2F         20,397-20,418    Forward (nested): GATTAAGAGTGGTAGGTTGTTG
                                   SP1R         21,809-21,828    Reverse: CAAACTGCAAGTGCTCACAC
                                   SP2R         21,797-21,816    Reverse (nested): GCTCACACTGCAACTTTTCA
                   229E            LPS1         20,732-20,751    Forward: AATAATTGGTTCCTTCTAAC
                                   JH1          20,797-20,818    Forward (nested): TTTGTTGCTTAATTGCTTATGG
                                   LPR          21,710-21,728    Reverse: AACATACACTGCCAAATTT                        This study
                                   JH2          21,675-21,694    Reverse (nested): TTTGCCAAAAGAAAAAGGGC
Nucleocapsid (N)   NL63 and 229E   aN-F         26,102-26,127    Forward: ARRTTGCTTCATTTWWTCTAA                      This study
                                                25,652-25,672
                                   aN-Fn        26,112-26,132    Forward (nested): ATTTWWTCTAAACTAAACRAA             This study
                   NL63            NL-NR        27,278-27,299    Reverse: ATAATAAACAKTCAACTGGAAT                     This study
                                   NL-NRn       27,267-27,287    Reverse (nested): CAACTGGAATTACAAAACAAT             This study
                   229E            E-NR         27,046-27,063    Reverse: GATCCTTGTCAAGCCAAA                         This study
                                   E-NRn        26,882-26,900    Reverse (nested): AAAATTCCAACTAAAGCCT               This study
1a (nsp3)          NL63            SS5852-5Pf   5,778-5,798      Forward: CTTTTGATAACGGTCACTATG
                                   P3E2-5Pf     5,789-5,810      Forward (semi-nested): GGTCACTATGTAGTTTATGATG
                                   NL-1aR       6,593-6,616      Reverse: CTCATTACATAAAACATCRAACGG                   This study
                   229E            E-1aF        5,865-5,885      Forward: CTGTTGAYAAAGGTCATTATA                      This study
                                   E-1aFn       5,876-5,897      Forward (semi-nested): GGTCATTATACTGTTTATGAYA       This study
                                   E-1aR        6,665-6,688      Reverse: TTCATCACAAATAACATCAAATGG                   This study


* Nucleotide location was determined based on the HCoV-NL63 (NC_005831) and HCoV-229E (NC_002645) 
reference sequences. 


TABLE 2 


Demographic data of 68 adult outpatients infected with human alphacoronavirus in Kuala Lumpur, Malaysia, 2012-2013


Factor                   HCoV-NL63 (N = 45)   HCoV-229E (N = 23)   P value
Gender                                                             0.80
  Male                   25 (55.6%)           12 (52.2%)
  Female                 20 (44.4%)           11 (47.8%)
Age                                                                0.45
  < 40                   13 (28.9%)           7 (30.4%)
  40-60                  10 (22.2%)           8 (34.8%)
  > 60                   22 (48.9%)           8 (34.8%)
Symptoms                                                           0.99
  Sneezing               42 (93.3%)           20 (87.0%)
  Nasal discharge        38 (84.4%)           19 (82.6%)
  Nasal congestion       29 (64.4%)           15 (65.2%)
  Headache               23 (51.1%)           13 (56.5%)
  Sore throat            32 (68.9%)           14 (60.9%)
  Hoarseness of voice    35 (77.8%)           15 (65.2%)
  Muscle ache            27 (60.0%)           16 (69.6%)
  Cough                  43 (95.6%)           20 (87.0%)
Ethnicity
  Malay                  11 (24.5%)           5 (21.8%)
  Chinese                24 (53.3%)           7 (30.4%)
  Indian                 10 (22.2%)           11 (47.8%)


TABLE 3


The genetic diversity among alphacoronavirus genotypes in the spike gene


HCoV-NL63    A      B      C
A            -      0.8    0.5
B            7.6    -      0.6
C            2.1    6.7    -

HCoV-229E    1      2      3      4
1            -      0.4    0.6    0.7
2            1.5    -      0.3    0.4
3            2.5    122,   -      0.3
4            3.5    2.6    1.5    -


* Pairwise genetic distances are expressed in percentage (%) of nucleotide difference.


† Standard error estimates of the mean genetic distances are shown in the upper diagonal.


SUPPLEMENTAL FIGURE 1. Phylogenetic analysis of the HCoV-NL63 spike, nucleocapsid, and 1a genes. The partial
spike (S1) (1,383 nt), complete nucleocapsid (1,133 nt), and partial 1a (nsp3) (781 nt) maximum likelihood trees
were constructed using the Hasegawa-Kishino-Yano nucleotide substitution model plus
discrete gamma categories in PAUP (phylogenetic analysis using parsimony). The HCoV-NL63 strains obtained from this
study were color coded and the HCoV-NL63 genotypes A-C were indicated; green = genotype A, blue = genotype 
B, and red = genotype C. The recombinant genotype is indicated by purple color. Scale bars indicating genetic 
distance (in nucleotide substitutions per site) are shown. Each HCoV-NL63 sequence was assigned to its genotype 
based on the S1 phylogenetic analysis. Country codes are as follows; MY = Malaysia; US = United States; JP = 
Japan; NL = Netherlands; CN = China. 


SUPPLEMENTAL FIGURE 2. Phylogenetic analysis of the HCoV-229E spike, nucleocapsid, and 1a genes. The partial
spike (S1) (855 nt), complete nucleocapsid (1,330 nt), and partial 1a (nsp3) (766 nt) maximum likelihood trees were
constructed using the Hasegawa-Kishino-Yano nucleotide substitution model plus discrete
gamma categories in PAUP (phylogenetic analysis using parsimony). The HCoV-229E strains obtained from this study were
color coded and the HCoV-229E groups 1-4 were indicated; green = genotype 1, red = genotype 2, blue = genotype 
3, and purple = genotype 4. Scale bars indicating genetic distance (in nucleotide substitutions per site) are shown. 
Each HCoV-229E sequence was assigned to its genotype based on the S1 phylogenetic analysis. Country codes are 
as follows; MY = Malaysia; US = United States; JP = Japan; NL = Netherlands; CN = China; AU = Australia; IT = 
Italy. 


SUPPLEMENTAL TABLE 1 


Evolutionary characteristics of HCoV-NL63 and HCoV-229E genotypes 


Subtype-gene   Evolutionary rate*               Genotype        tMRCA†
NL63-Spike     4.3 × 10⁻⁴ (2.1-6.6 × 10⁻⁴)      All genotypes   1902.2 (1805.4-1974.4)
                                                Genotype A      1973.9 (1961.2-1983.8)
                                                Genotype B      1995.6 (1989.7-2000.2)
                                                Genotype C      2003.0 (1998.6-2006.5)
229E-Spike     3.9 × 10⁻⁴ (1.3-6.5 × 10⁻⁴)      All groups      1956.8 (1948.4-1962.0)
                                                Group 1         1976.6 (1973.7-1978.9)
                                                Group 2         1981.1 (1979.6-1982.0)
                                                Group 3         1989.0 (1987.4-1990.0)
                                                Group 4         1996.3 (1993.0-1999.0)


* Estimated mean rates of evolution, expressed in nucleotide substitutions/site/year, under a relaxed molecular
clock with a GTR+I substitution model and an exponential tree model. The 95% highest posterior density (HPD)
intervals are included in parentheses.


† Mean time of the most recent common ancestor (tMRCA, in calendar year). The 95% highest posterior density
intervals are indicated.


SUPPLEMENTAL TABLE 2 


Comparison of upper respiratory tract infection symptoms severities between patients infected with HCoV-NL63 
and HCoV-229E 


Symptom               Severity level   HCoV-NL63   HCoV-229E   P value*
Sneezing              None             3           3           0.472
                      Moderate         35          15
                      Severe           7           5
Nasal discharge       None             7           4           0.051
                      Moderate         33          11
                      Severe           5           8
Nasal congestion      None             16          8           0.727
                      Moderate         24          11
                      Severe           5           4
Headache              None             22          10          0.696
                      Moderate         17          8
                      Severe           6           5
Sore throat           None             12          9           0.269
                      Moderate         26          9
                      Severe           6           5
Hoarseness of voice   None             10          8           0.172
                      Moderate         34          13
                      Severe           1           2
Muscle ache           None             17          7           0.252
                      Moderate         20          15
                      Severe           7           1
Cough                 None             2           3           0.477
                      Moderate         32          15
                      Severe           11          5


* P values < 0.05 represent significant results. 


[Supplemental Figure 2 panels: B, HCoV-229E nucleocapsid gene (1,330 nt); C, HCoV-229E 1a (nsp3) gene (766 nt).]

